In this exercise, we will be using functions from the
tidyverse package and the broom package. More
attractive tables like those seen in lectures can be produced using
gt.
library(tidyverse)
library(broom)
library(gt)
We’ve seen the data in airport_screening.csv in the lectures. We also noticed there seemed to be a relationship between number of trips and tendency to bring in Biosecurity Risk Material (BRM), but we haven’t assessed it formally.
Load the data set using screening <- read_csv("airport_screening.csv") and use the t.test function to compare the average number of trips between those people with BRM and those without. Recall that we use the y ~ x model syntax for two-sample t-tests like this (the data are not paired).
Extract the P-value by creating an object using tidy(). You could even try formatting it automatically using the scales package, as demonstrated in lectures.
screening <- read_csv("airport_screening.csv")
Rows: 200 Columns: 24
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): sex, passenger_crew, arrival_port, airline, check_in_port, passpor...
dbl (18): BRM, age, month, year, period_stay, number_trips, eggs, other_flow...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
t.test(number_trips ~ BRM, data = screening)
Welch Two Sample t-test
data: number_trips by BRM
t = 4.281, df = 189.34, p-value = 2.954e-05
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
15.68899 42.50207
sample estimates:
mean in group 0 mean in group 1
39.26220 10.16667
test.result <- tidy(t.test(number_trips ~ BRM, data = screening))
test.result$p.value
[1] 2.953688e-05
scales::pvalue(test.result$p.value)
[1] "<0.001"
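The tidied result holds more than the P-value. As a minimal sketch, the example below pulls out the estimated difference in means and its confidence interval as well. It runs on simulated stand-in data (the object names toy and toy_result and the simulated values are assumptions for illustration, not the real airport_screening.csv), so the numbers will not match the output above.

```r
library(broom)

# Hypothetical stand-in data: NOT the real airport_screening.csv
set.seed(1)
toy <- data.frame(
  BRM = rep(c(0, 1), times = c(30, 10)),
  number_trips = c(rpois(30, lambda = 40), rpois(10, lambda = 10))
)

# Welch two-sample t-test, tidied into a one-row data frame
toy_result <- tidy(t.test(number_trips ~ BRM, data = toy))

# estimate is the difference in means (group 0 minus group 1);
# conf.low and conf.high bound its 95% confidence interval
toy_result[, c("estimate", "conf.low", "conf.high", "p.value")]

# Format the P-value for reporting
scales::pvalue(toy_result$p.value)
```

The same column names should apply to test.result from the real data, so test.result$conf.low and test.result$conf.high would give the interval printed by t.test.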
Use a Fisher’s exact test (fisher.test()) to compare the rates of BRM for passengers compared to crew.
Use a Chi-squared test (chisq.test()) to compare the rates of BRM for the four passport countries.
Use the group_by and summarise functions, together with mean, to print the actual proportions of BRM in the groups, following the example in lectures, to help understand these results.
fisher.test(screening$passenger_crew, screening$BRM)
Fisher's Exact Test for Count Data
data: screening$passenger_crew and screening$BRM
p-value = 0.355
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.3744039 Inf
sample estimates:
odds ratio
Inf
chisq.test(screening$passport_country, screening$BRM)
Warning in chisq.test(screening$passport_country, screening$BRM): Chi-squared
approximation may be incorrect
Pearson's Chi-squared test
data: screening$passport_country and screening$BRM
X-squared = 30.605, df = 3, p-value = 1.029e-06
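The warning above is raised when some expected cell counts are small (below 5), which makes the chi-squared approximation less reliable. One way to investigate, sketched below, is to inspect the expected counts and to request a Monte Carlo P-value with simulate.p.value = TRUE. The table toy_counts is made up for illustration and is not taken from airport_screening.csv.

```r
# Hypothetical counts: NOT taken from airport_screening.csv
toy_counts <- matrix(
  c(40, 1,
    50, 8,
     6, 4,
     5, 6),
  ncol = 2, byrow = TRUE,
  dimnames = list(passport_country = c("A", "B", "C", "D"),
                  BRM = c("0", "1"))
)

# Expected counts under independence; values below 5 trigger the warning
toy_expected <- suppressWarnings(chisq.test(toy_counts))$expected
toy_expected

# Monte Carlo P-value, which does not rely on the chi-squared approximation
set.seed(1)
toy_sim <- chisq.test(toy_counts, simulate.p.value = TRUE, B = 10000)
toy_sim$p.value
```

With the real data, the equivalent call would be chisq.test(screening$passport_country, screening$BRM, simulate.p.value = TRUE). Running fisher.test() on the same two variables is another option for small counts.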
screening %>%
group_by(passenger_crew) %>%
summarise(mean(BRM))
# A tibble: 2 × 2
passenger_crew `mean(BRM)`
<chr> <dbl>
1 C 0
2 P 0.188
screening %>%
group_by(passport_country) %>%
summarise(mean(BRM))
# A tibble: 4 × 2
passport_country `mean(BRM)`
<chr> <dbl>
1 A 0.0244
2 B 0.145
3 C 0.4
4 D 0.52
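Proportions on their own hide how many people are in each group, and that matters when interpreting the tests: a proportion of 0 in a small group is much weaker evidence than the same proportion in a large one. As a sketch, the summary can be extended to report group sizes and counts alongside the proportions. The data below are a made-up stand-in (toy and the column names n, n_brm and prop_brm are assumptions for illustration), not the real screening data.

```r
library(dplyr)

# Hypothetical stand-in data: NOT the real airport_screening.csv
toy <- tibble(
  passenger_crew = rep(c("C", "P"), times = c(5, 15)),
  BRM = c(rep(0, 5), rep(c(1, 0), times = c(3, 12)))
)

toy_summary <- toy %>%
  group_by(passenger_crew) %>%
  summarise(
    n = n(),              # group size
    n_brm = sum(BRM),     # number of people with BRM
    prop_brm = mean(BRM)  # proportion of people with BRM
  )
toy_summary
```

Replacing toy with screening should give the same layout for the real data, and swapping passenger_crew for passport_country gives the breakdown by passport country.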
© 2021 Statistical Consulting Centre, The University of Melbourne.